126 research outputs found

    PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning

    Online class-incremental continual learning is a specific setting of continual learning: a model must continuously learn new classes from a data stream in which each sample is seen only once, and it therefore suffers from catastrophic forgetting, i.e., the loss of historical knowledge of old classes. Existing replay-based methods effectively alleviate this issue by saving part of the old data and replaying it in either a proxy-based or a contrastive-based manner. Although both replay manners are effective, the former tends to be biased toward new classes due to class imbalance, while the latter is unstable and hard to converge because of the limited number of stored samples. In this paper, we conduct a comprehensive analysis of these two replay manners and find that they are complementary. Inspired by this finding, we propose a novel replay-based method called proxy-based contrastive replay (PCR). The key operation is to replace the contrastive samples of anchors with their corresponding proxies in the contrastive-based loss. This alleviates catastrophic forgetting by effectively addressing the imbalance issue while maintaining faster model convergence. We conduct extensive experiments on three real-world benchmark datasets, and the empirical results consistently demonstrate the superiority of PCR over various state-of-the-art methods.
    Comment: To appear in CVPR 2023. 10 pages, 8 figures and 3 tables.
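    The key operation described above, contrasting each anchor against class proxies rather than other samples, can be sketched as a proxy-based softmax loss. This is a minimal NumPy illustration, not the paper's implementation; the temperature value and the use of all proxies (rather than only those of classes in the batch) are simplifying assumptions.

```python
import numpy as np

def pcr_loss(features, labels, proxies, temperature=0.09):
    """Sketch of a proxy-based contrastive loss: each anchor feature is
    contrasted against class proxies (one learnable vector per class)
    instead of against other replayed samples.  `temperature` is a
    hypothetical value, not taken from the paper."""
    # L2-normalise anchors and proxies so dot products are cosine similarities
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    logits = f @ p.T / temperature                 # shape: (batch, num_classes)
    # softmax cross-entropy against each anchor's own class proxy
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

    Because every class contributes exactly one proxy to the denominator, the loss is not dominated by whichever classes happen to have many samples in the replay batch, which is the imbalance argument the abstract makes.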

    MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning

    Few-Shot Learning (FSL) is a challenging task: how can novel classes be recognized from only a few examples? Pre-training based methods tackle the problem effectively by pre-training a feature extractor and then predicting novel classes via a cosine nearest-neighbor classifier with mean-based prototypes. Nevertheless, due to data scarcity, the mean-based prototypes are usually biased. In this paper, we attempt to diminish the prototype bias by regarding it as a prototype optimization problem. To this end, we propose a novel meta-learning based prototype optimization framework that rectifies prototypes, i.e., it introduces a meta-optimizer to optimize them. Although existing meta-optimizers can also be adapted to our framework, they all overlook a crucial gradient bias issue, i.e., the mean-based gradient estimation is also biased on sparse data. To address this issue, we regard the gradient and its flow as meta-knowledge and propose a novel Neural Ordinary Differential Equation (ODE)-based meta-optimizer, called MetaNODE, to polish prototypes. In this meta-optimizer, we first view the mean-based prototypes as initial prototypes, and then model the process of prototype optimization as continuous-time dynamics specified by a Neural ODE. A gradient flow inference network is carefully designed to learn to estimate the continuous gradient flow for the prototype dynamics. Finally, the optimal prototypes are obtained by solving the Neural ODE. Extensive experiments on miniImagenet, tieredImagenet, and CUB-200-2011 show the effectiveness of our method.
    Comment: Accepted by AAAI 202
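    The prototype-refinement idea, treating mean-based prototypes as the initial state of an ODE and integrating a learned gradient flow forward in time, can be sketched with a simple Euler solver. The `grad_flow` callable below stands in for the paper's gradient flow inference network, which is a trained model; everything here is a hypothetical illustration of the solver step only.

```python
import numpy as np

def refine_prototypes(init_protos, grad_flow, t_end=1.0, steps=10):
    """Euler integration of the prototype dynamics dp/dt = grad_flow(p, t).
    `init_protos` are the mean-based prototypes (the ODE's initial value);
    `grad_flow(p, t)` is a placeholder for a learned gradient-flow network.
    The step count and horizon are illustrative choices, not the paper's."""
    p = init_protos.copy()
    dt = t_end / steps
    for k in range(steps):
        p = p + dt * grad_flow(p, k * dt)   # one explicit Euler step
    return p
```

    In practice a Neural ODE would use an adaptive solver rather than fixed-step Euler, but the structure, initial prototypes in, refined prototypes out, is the same.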

    Magnesia-stabilised zirconia solid electrolyte assisted electrochemical investigation of iron ions in the SiO2-CaO-MgO-Al2O3 molten slag at 1723 K

    Production of metallic iron through molten oxide electrolysis using inert electrodes is an alternative route for fast ironmaking without CO2 emissions. The fact that many inorganic oxides melt at ultrahigh temperatures (>1500 K) challenges the conventional electro-analytical techniques used in aqueous, organic and molten salt electrolytes. However, in order to design a feasible and effective electrolytic process, it is necessary to thoroughly understand the electrochemical properties of iron ions in molten oxide electrolytes. In this work, a magnesia-stabilised zirconia (MSZ) tube with a closed end was used to construct an integrated three-electrode cell, with the “MSZ | Pt | O2 (air)” assembly functioning as the solid electrolyte, the reference electrode and also the counter electrode. Electrochemical reduction of iron ions was systematically investigated on an iridium (Ir) wire working electrode in the SiO2-CaO-MgO-Al2O3 molten slag at 1723 K by cyclic voltammetry (CV), square wave voltammetry (SWV), chronopotentiometry (CP) and potentiostatic electrolysis (PE). The results show that the electro-reduction of the Fe2+ ion to Fe on the Ir electrode in the molten slag follows a single two-electron transfer step, and that the rate of the process is diffusion controlled. The peak current on the obtained CVs is proportional to the concentration of the Fe2+ ion in the molten slag and to the square root of the scan rate. The diffusion coefficient of Fe2+ ions in the molten slag containing 5 wt% FeO at 1723 K was derived to be (3.43 ± 0.06) × 10^-6 cm^2 s^-1 from CP analysis. However, two subsequent processes, i.e., alloy formation on the Ir electrode surface and interdiffusion, were found to affect the kinetics of iron deposition. An ECC mechanism is proposed to account for the CV observations. The findings from this work confirm that zirconia-based solid electrolytes can play an important role in fundamental electrochemical research in high-temperature molten slag electrolytes.
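    Deriving a diffusion coefficient from chronopotentiometry, as done above, conventionally rests on the Sand equation, i·τ^(1/2) = n·F·A·C·(πD)^(1/2)/2, relating the applied current i, transition time τ, electron number n, electrode area A and concentration C. The snippet below solves that generic relation for D; it is a textbook formula, not the authors' exact fitting procedure, and all parameter values in the usage are hypothetical.

```python
import math

def sand_diffusion_coefficient(current, tau, n, area, conc):
    """Diffusion coefficient D from the Sand equation for chronopotentiometry:
        current * sqrt(tau) = n * F * area * conc * sqrt(pi * D) / 2
    Units must be consistent, e.g. A, s, cm^2 and mol/cm^3 give D in cm^2/s.
    This is the generic relation, not the paper's specific analysis."""
    F = 96485.0  # Faraday constant, C/mol
    return (2.0 * current * math.sqrt(tau) / (n * F * area * conc)) ** 2 / math.pi
```

    With n = 2 (the two-electron Fe2+ → Fe step reported above) and measured i, τ, A and C, this inversion yields D directly.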

    Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code

    This paper introduces Tiramisu, a polyhedral framework designed to generate high performance code for multiple platforms including multicores, GPUs, and distributed machines. Tiramisu introduces a scheduling language with novel extensions to explicitly manage the complexities that arise when targeting these systems. The framework is designed for the areas of image processing, stencils, linear algebra and deep learning. Tiramisu has two main features: it relies on a flexible representation based on the polyhedral model, and it has a rich scheduling language allowing fine-grained control of optimizations. Tiramisu uses a four-level intermediate representation that allows full separation between the algorithms, loop transformations, data layouts, and communication. This separation simplifies targeting multiple hardware architectures with the same algorithm. We evaluate Tiramisu by writing a set of image processing, deep learning, and linear algebra benchmarks and compare them with state-of-the-art compilers and hand-tuned libraries. We show that Tiramisu matches or outperforms existing compilers and libraries on different hardware architectures, including multicore CPUs, GPUs, and distributed machines.
    Comment: arXiv admin note: substantial text overlap with arXiv:1803.0041
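    The algorithm/schedule separation described above can be illustrated with loop tiling, a classic transformation such scheduling languages express. The Python sketch below is not Tiramisu code (Tiramisu's scheduling language is C++-embedded); it only shows that the transformed loop nest computes the same algorithm under a different iteration order.

```python
import numpy as np

def matmul_tiled(A, B, tile=2):
    """Illustrative loop tiling: the algorithm (C[i,j] += A[i,k]*B[k,j])
    is unchanged; only the iteration order is transformed into blocks,
    the kind of schedule a polyhedral compiler applies for locality.
    A pure-Python loop nest, for clarity rather than speed."""
    n, kdim = A.shape
    m = B.shape[1]
    C = np.zeros((n, m))
    for ii in range(0, n, tile):              # inter-tile loops over blocks
        for jj in range(0, m, tile):
            for kk in range(0, kdim, tile):
                for i in range(ii, min(ii + tile, n)):      # intra-tile loops
                    for j in range(jj, min(jj + tile, m)):
                        for k in range(kk, min(kk + tile, kdim)):
                            C[i, j] += A[i, k] * B[k, j]
    return C
```

    A scheduling language lets the programmer state "tile by 2" as a directive on the untiled algorithm, instead of hand-rewriting the nest as above, which is what makes retargeting the same algorithm to different hardware tractable.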